1. INTRODUCTION

To access a system service, users generally need an account consisting of a username and
a password [1, 2]. The password is the main key to securing an account, and passwords remain one of
the most popular authentication methods [3-5]. Users usually build their passwords from information
they already know, such as their full name, date of birth, or the names of their parents [6, 7].
Passwords are a simple authentication method that is very easy to implement, which is why many systems
in cyberspace still rely on this conventional method. Because the method is so simple, however,
passwords can be guessed through dictionary attacks and brute-force attacks [8, 9]. One technique for
making an account harder to break into is to add special characters to the password
(example: "<% $ @!") [3]. Such passwords are not easy to remember, because users must recall the exact
characters every time they log in to the system [1], which makes logging in more difficult [9]. A user
who has not used an account for a long time may be unable to log in at all because he or she has
forgotten the password, which leaves users frustrated [10].

Keystroke Dynamic Authentication (KDA) is one solution to the problems described above.
KDA is an authentication technique that uses a person's typing habits as a login parameter for the user
of a system [6, 11, 12]. The purpose of KDA is to increase the security of the passwords that are already
widely used and to address the account security issues caused by irresponsible users (hackers or
attackers) who break into accounts [3, 6]. KDA is one of the Biometric Authentication techniques.
Biometric Authentication uses something unique to each user, such as the face, fingerprints, or habits
(in this case, the typing habits used by KDA) [3, 6, 13]. A person's face, fingerprints, and habits
cannot easily be imitated by others, which suggests that applying KDA to a system is very safe [14].
The main reason for using KDA in this research is that it is low cost and does not need any additional
devices [14-16]; it only uses the keyboard. This differentiates KDA from other Biometric Authentication
techniques, such as face or fingerprint recognition, which require additional devices [17]. Another
advantage of KDA is that passwords do not have to contain special characters and can consist of
alphabetic and numeric characters only [6, 15]. In addition, users who log in to the system will not
notice that the system uses the KDA method to secure their accounts.

Several KDA studies use the Scaled Manhattan method [3, 18]. These studies rely on
the average of the training data. The average has a weakness for data streams such as KDA, i.e.
it does not follow changes in the values over time [15], whereas a user's typing speed changes over
time (it may become faster or slower depending on certain conditions). Because the average is not
suitable for this problem, we propose the use of the Mean of Horner's Rules (MHR), which can adapt to
changes in values over time. Using MHR in KDA can also improve the accuracy of the classification
between attackers and users compared to using averages [6, 15]. In this research, we therefore combine
the Scaled Manhattan method with MHR to improve the accuracy of the classification between attackers
and users. The methods used, the final results, and the discussion of this research are presented in
the following sections.


2. RESEARCH METHOD 

This research uses the Scaled Manhattan Distance [3, 19] combined with the Mean of Horner's Rules
(MHR) [15]. The purpose of this combination is to improve the accuracy of the classification between
attackers and users. This is supported by the results of Chandranegara and Sumadi [6], who combined
a classification method with MHR and obtained higher accuracy than with the previous method:
the classification method without MHR produced an accuracy of approximately 75%, and when combined
with MHR the accuracy became approximately 93% (an increase of 18 percentage points).
The Keystroke Dynamic data used in this research is taken from Killourhy and Maxion [19].
The formula of the Scaled Manhattan Distance method is as follows [3, 19]:


$$\text{Ø}_n = \sum_{p=1}^{P} \frac{\lvert f_{p,n} - g_n \rvert}{a_n} \tag{1}$$


where P is the total number of training data and n is a feature of the data. f_{p,n} is the training data
of the nth feature with p = 1, ..., P, g_n is the average of the training data per feature, and a_n is
the absolute deviation of the training data per feature. The absolute deviation can be obtained with
the following formula [3, 19]:


$$a_n = \frac{1}{Q-1} \sum_{q=1}^{Q} \lvert f_{q,n} - g_n \rvert \tag{2}$$


where Q is the total number of training data and n is a feature of the data. f_{q,n} is the training data
of the nth feature with q = 1, ..., Q. Furthermore, to find the MHR value, the following formula can be
used [6, 15]:


$$\mathrm{MHR} = \frac{\dfrac{\dfrac{X_1 + X_2}{2} + X_3}{2} + \cdots + X_n}{2} \tag{3}$$


where X_n is the nth data from the training data.
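
As an illustration of how MHR differs from a plain average, the following Python sketch assumes that
MHR is the nested pairwise average (((X_1 + X_2)/2 + X_3)/2 + ... + X_n)/2, so that recent values carry
more weight than older ones. The function name and the sample hold times are hypothetical and are not
taken from the dataset.

```python
from statistics import mean


def mean_of_horners_rules(values):
    """Nested pairwise average: (((x1 + x2)/2 + x3)/2 + ... + xn)/2.

    This nested form is the assumption made in this sketch.
    """
    if not values:
        raise ValueError("values must not be empty")
    result = values[0]
    for x in values[1:]:
        # Each new value is averaged with the running result,
        # so older values lose half of their weight at every step.
        result = (result + x) / 2
    return result


# Hypothetical hold times (seconds) of a user who starts typing faster.
hold_times = [0.20, 0.20, 0.20, 0.10, 0.10]
print("average:", mean(hold_times))
print("MHR:    ", mean_of_horners_rules(hold_times))
```

For this hypothetical sequence the plain average stays close to the older, slower values, whereas
the MHR value lies closer to the most recent ones.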


This research proposes combining the Scaled Manhattan Distance with the Mean of Horner's Rules (MHR)
as follows:


$$\text{Ø}_n = \sum_{p=1}^{P} \frac{\lvert f_{p,n} - \mathrm{MHR}_n \rvert}{a_n} \tag{4}$$

where the sum runs over all training data of the nth feature and the combination is done by replacing
the average value with MHR_n, the MHR of the nth feature. The purpose of using MHR is
to improve the accuracy of the previous method. This is supported by the results of Chandranegara and
Sumadi [6], which state that accuracy increases when the average is replaced by MHR. To
classify attackers and users, we use the following rules [6]:

- If $\lvert \mathrm{MHR}_n - T_n \rvert < \text{Ø}_n$, then the user is considered an actual user.

- If $\lvert \mathrm{MHR}_n - T_n \rvert > \text{Ø}_n$, then the user is considered an attacker.

where T is the testing data and n is a feature of the testing data.
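
The proposed score and the classification rule can be sketched for a single feature as follows. This is
a literal reading of the formulas under stated assumptions, not the program used in this research and
not verified to reproduce the reported results: MHR is assumed to be the nested pairwise average,
the absolute deviation is assumed to be normalised by the number of training data minus one, and a tie
is treated as a rejection. How the per-feature decisions are combined into one decision per login is
not specified here, so the sketch stops at one feature. All function names and sample values are
hypothetical.

```python
def mean_of_horners_rules(values):
    """Nested pairwise average (assumed form of MHR)."""
    result = values[0]
    for x in values[1:]:
        result = (result + x) / 2
    return result


def absolute_deviation(train):
    """Absolute deviation around the average g_n.

    The (count - 1) normalisation is an assumption of this sketch.
    """
    g_n = sum(train) / len(train)
    return sum(abs(x - g_n) for x in train) / (len(train) - 1)


def proposed_score(train, coefficient=1.0):
    """Scaled Manhattan score with the average replaced by MHR.

    coefficient scales a_n; 1.0 corresponds to the unscaled score.
    """
    a_n = absolute_deviation(train)
    if a_n == 0:
        raise ValueError("training data has zero absolute deviation")
    mhr_n = mean_of_horners_rules(train)
    score = sum(abs(x - mhr_n) for x in train) / (coefficient * a_n)
    return mhr_n, score


def is_actual_user(mhr_n, score, test_value):
    """True if |MHR_n - T_n| < score; a tie is rejected (assumption)."""
    return abs(mhr_n - test_value) < score


# Hypothetical training values of one feature (seconds).
train = [0.10, 0.12, 0.11, 0.13, 0.09]
mhr_n, score = proposed_score(train)
print(is_actual_user(mhr_n, score, test_value=0.11))
```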

To evaluate how well the proposed KDA method accepts users and rejects attackers,
this research uses the FAR (False Acceptance Rate) and FRR (False Rejection Rate)
values [15, 20, 21]. FAR is the probability that the system/method accepts an attacker
as a user [15, 20, 21], whereas FRR is the probability that the system/method rejects a user and
detects him or her as an attacker [15, 20, 21]. The FAR and FRR values are obtained with formulas (5)
and (6); the smaller the FAR or FRR value, the better the result of the KDA classification
applied [6, 22].


$$\mathrm{FAR} = \frac{\text{number of accepted attackers}}{\text{total number of attackers}} \tag{5}$$

$$\mathrm{FRR} = \frac{\text{number of rejected users}}{\text{total number of users}} \tag{6}$$

In addition to FAR and FRR, we also evaluate accuracy using the following formula [6]:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100\% \tag{7}$$


where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.
TP is the success in accepting users as actual users and TN is the success in detecting attackers.
FP is a misclassification in which an attacker is accepted as an actual user, and FN is
a misclassification in which an actual user is rejected as an attacker. To obtain the accuracy value
explained above, we use the following scenarios:

a. The Keystroke Dynamic data is divided into 2 types, i.e. training data and testing data.

b. The training data for every user is the first 350 data of that user in the dataset. For example,
   the training data of User "A" is data 1 to 350, from a total of 400 KDA data.

c. The testing data for every user is the last 50 data of that user in the dataset. For example,
   User "A" uses KDA data 351 to 400, from a total of 400 KDA data, as testing data.

d. Furthermore, each user in the data is used as an attacker against every other user. Thus,
   51 attack scenarios are formed (the data contains 51 users). The attacker data is the last 50 data
   of the user who acts as the attacker.
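
The scenarios above can be sketched as follows. The sketch is written in Python for illustration only,
and the function names and the placeholder dataset (a dictionary mapping a user name to a list of
400 samples) are assumptions.

```python
def split_user_samples(samples, n_train=350, n_test=50):
    """Return (training data, testing data): first n_train and last n_test samples."""
    if len(samples) < n_train + n_test:
        raise ValueError("not enough samples for the requested split")
    return samples[:n_train], samples[-n_test:]


def attacker_samples(dataset, target_user, n_test=50):
    """Last n_test samples of every other user, used as attacks on target_user."""
    return {
        user: samples[-n_test:]
        for user, samples in dataset.items()
        if user != target_user
    }


# Placeholder dataset: 51 hypothetical users with 400 numbered samples each.
dataset = {f"user{i}": list(range(1, 401)) for i in range(51)}

train, test = split_user_samples(dataset["user0"])
attacks = attacker_samples(dataset, "user0")
print(len(train), len(test), len(attacks))
```

With this placeholder data, the testing data of a user starts at sample 351 and each user is attacked
by the other 50 users.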

The tests in this research are run with a program written in PHP that implements both the proposed
method and the previous method and follows the predetermined scenarios.
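
The evaluation measures in formulas (5) to (7) can be sketched as follows. The sketch uses Python
rather than the PHP program used in this research, and the function names and the counts in
the example are hypothetical.

```python
def false_acceptance_rate(accepted_attackers, total_attackers):
    """FAR: share of attacker attempts that were accepted."""
    return accepted_attackers / total_attackers


def false_rejection_rate(rejected_users, total_users):
    """FRR: share of actual-user attempts that were rejected."""
    return rejected_users / total_users


def accuracy(tp, fp, tn, fn):
    """Accuracy in percent from the four confusion counts."""
    return (tp + tn) / (tp + fp + tn + fn) * 100


# Hypothetical counts: 50 actual-user attempts and 50 attacker attempts.
tp, fn = 35, 15   # actual users accepted / rejected
tn, fp = 32, 18   # attackers rejected / accepted

print(false_acceptance_rate(fp, fp + tn))
print(false_rejection_rate(fn, fn + tp))
print(accuracy(tp, fp, tn, fn))
```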


3. RESULTS AND ANALYSIS 
3.1. Dataset 
This research uses the Keystroke Dynamic data from Killourhy and Maxion [19]. The data contains
51 users (30 male and 21 female), and each user has 400 Keystroke Dynamic data. The data was collected
over 8 days, with 50 Keystroke Dynamic data recorded per user each day.
The times in this data are given in seconds. The password typed in the Keystroke Dynamic data recording
is ".tie5Roanl". This password was chosen after several trials, which showed that it
has a high level of password security. For this data, each user typed the password and the keystrokes
were recorded. The data contains several important feature elements [6, 7, 20, 23, 24, 25]
(these features are illustrated in Figure 1):
a. Hold time (H) is the time needed to press a character (Key-Down to Key-Up).
b. Up-Down (UD) is the time between releasing a character (Key-Up) and pressing the next character
   (Key-Down), commonly referred to as Latency Time.
c. Down-Down (DD) is the time between pressing the first character (Key-Down) and pressing the second
   character (Key-Down), commonly referred to as Flight Time.
The data has a total of 31 features, which consist of:
a. Hold time of the character ".", the character "t", and so on up to the last character "l", with
   the "return" button also included, for a total of 11 features.

b. Up-Down (UD) between the characters "." and "t", and so on up to the UD between the last character
   and the "return" button, for a total of 10 features.

c. Down-Down (DD)/Flight Time between the characters "." and "t", and so on up to the DD between
   the last character "l" and the "return" button, for a total of 10 features.
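
The three feature types can be derived from key press and release times as sketched below. The function
name, the key labels, and the timestamps are hypothetical; the dataset already provides the computed
features, so this only illustrates the definitions.

```python
def keystroke_features(events):
    """Derive hold, Down-Down and Up-Down times from key events.

    events: list of (key, key_down_time, key_up_time) tuples in typing
    order, with times in seconds.
    """
    hold = [up - down for _, down, up in events]
    # Pair each key with the next one to get the between-key timings.
    pairs = list(zip(events, events[1:]))
    down_down = [nxt[1] - cur[1] for cur, nxt in pairs]
    up_down = [nxt[1] - cur[2] for cur, nxt in pairs]
    return hold, down_down, up_down


# Hypothetical timestamps: every key is pressed 0.25 s after the
# previous one and held for 0.125 s.
keys = [".", "t", "i", "e", "5", "R", "o", "a", "n", "l", "Return"]
events = [(key, i * 0.25, i * 0.25 + 0.125) for i, key in enumerate(keys)]

hold, down_down, up_down = keystroke_features(events)
print(len(hold), len(down_down), len(up_down))  # prints: 11 10 10
```

For 11 keys this gives 11 hold times, 10 DD times, and 10 UD times, i.e. 31 features in total. By
definition, each DD time equals the hold time of the first key plus the UD time to the next key.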


[Figure: timeline of two consecutive keystrokes (Key Down, Key Up, Key Down, Key Up) marking hold time, latency time, and flight time]

Figure 1. Illustration of KDA Features [6] 


3.2. Result and analysis 

Based on the test results, the proposed method produces an accuracy of 50.113%, which is slightly
lower than the 50.335% of the previous method. However, the FAR value of the proposed method is lower
than that of the previous method (the proposed method has a FAR value of 0.976 and the previous
method has a FAR value of 0.98). The improvement therefore occurs in the FAR value, not in accuracy.
Because accuracy did not improve, we modified the proposed method by adding a coefficient of 5, as
shown in formula (8).


$$\text{Ø}_n = \sum_{p=1}^{P} \frac{\lvert f_{p,n} - \mathrm{MHR}_n \rvert}{5 \cdot a_n} \tag{8}$$


After the modification, the accuracy increases to 66.963% (see Table 2). The coefficient 5 was chosen
based on experiments with coefficients from 1 to 7 (the results of the coefficient experiment
are shown in Table 1). The test results show that coefficient 5 has FAR and FRR values of
0.356 and 0.305. These two values are almost the same and can be said to be balanced. With the other
coefficients, either the FAR value is low but the FRR value is high, or the FRR value is low but
the FAR value is high.


Table 1. Results of using different coefficients in the proposed method

Coefficient    FAR      FRR
1              0.976    0.021
2              0.863    0.082
3              0.673    0.172
4              0.490    0.241
5              0.356    0.305
6              0.251    0.367
7              0.179    0.425
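
The balance between FAR and FRR can be checked directly from the values in Table 1, as in the sketch
below. Selecting the coefficient with the smallest gap between FAR and FRR is one way to express
"balanced" and is an assumption of this sketch. The rough accuracy estimate further assumes equal
numbers of actual-user and attacker test attempts, which is not stated explicitly, so it is only
a consistency check.

```python
# FAR and FRR per coefficient, copied from Table 1.
table_1 = {
    1: (0.976, 0.021),
    2: (0.863, 0.082),
    3: (0.673, 0.172),
    4: (0.490, 0.241),
    5: (0.356, 0.305),
    6: (0.251, 0.367),
    7: (0.179, 0.425),
}


def balance_gap(coefficient):
    """Absolute difference between FAR and FRR for one coefficient."""
    far, frr = table_1[coefficient]
    return abs(far - frr)


def balanced_accuracy(far, frr):
    """Accuracy in percent if both classes have equally many attempts (assumption)."""
    return (1 - (far + frr) / 2) * 100


most_balanced = min(table_1, key=balance_gap)
print(most_balanced)
print(balanced_accuracy(*table_1[most_balanced]))
```

Under these assumptions the most balanced coefficient is 5, and the estimate for coefficient 5 is about
66.95%, close to the average accuracy reported for the proposed method with this coefficient.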





We do not use the coefficients that give the lowest FAR value or the lowest FRR value for
the following reasons:
a. If the FAR value is high, the possibility of the system accepting an attacker as an actual user
   is higher.
b. If the FRR value is high, the possibility of the system rejecting actual users, or treating them
   as attackers, is higher.
These two reasons are our main benchmarks for using coefficient 5, and they are also based on the results
of previous studies [6, 15]. The accuracy of the proposed method with the coefficient and of
the previous method is presented in Table 2. These results
are the average accuracy obtained from the 51 preplanned scenarios.

Table 2. Results of research

Method                                                           Average accuracy (%)
Scaled Manhattan Distance                                        50.335
Combined Scaled Manhattan Distance with MHR (coefficient = 5)    66.963





4. CONCLUSION 

Based on the results of this research, the accuracy of the proposed method without a coefficient
does not increase compared to the previous method. This is because the value produced by the modified
Scaled Manhattan Distance is less suitable for accepting users and rejecting attackers. We therefore
added a coefficient to the method, and the results show that accuracy can then be increased, although
not greatly. The best coefficient for the proposed method is 5, because
the test results show that coefficient 5 gives the most balanced FAR and FRR values
compared to the other coefficients. Although accuracy does not increase without the coefficient,
the proposed method still reduces the FAR (False Acceptance Rate). In other words, the proposed method
without the coefficient gives a good result for FAR but not for accuracy.

In future research, feature selection could be added so that the computational cost of
the classification is reduced and the features that are most important for Keystroke Dynamic
Authentication are selected. The same modification could also be applied to other methods that use
averages in their classification. Our research also indicates that accuracy alone cannot be used as
the benchmark of whether a method is good; other parameters should be considered as well, which in
the case of KDA are the FAR and FRR values. Based on the results of this research, the proposed method
can be applied to real or desktop-based login systems, and users will not be aware that the login
method is secured with Keystroke Dynamic Authentication.